A Non-blocking Checkpointing Algorithm for Distributed Systems

Authors

  • Liu Guoliang
  • Chen Shuyu
  • Zhang Xiaoqin
Abstract

Checkpointing and rollback recovery is an effective fault-tolerance technique that has been widely used in parallel and distributed computer systems. We present a non-blocking coordinated checkpointing algorithm for distributed systems that differs from the conventional approach, in which processes first take temporary checkpoints and then convert them to permanent ones. The proposed algorithm allows processes to take permanent checkpoints directly, without taking temporary checkpoints, which contributes to its speed of execution. Orphan messages are eliminated by the sender processes, and in-transit messages are eliminated by the checkpointing interval and a retransmission mechanism. The algorithm reduces the control-message complexity of taking a checkpoint to O(n), requiring only n-1 control messages.
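The two message categories named in the abstract can be made concrete with a small sketch. It is not the authors' algorithm; it only encodes the standard definitions: a message is an orphan if the receiver's checkpoint records its receipt but the sender's checkpoint does not record its send, and it is in transit if the send is recorded but the receipt is not. The `Checkpoint` class, the `(sender, receiver, sequence)` message tuples, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical per-process checkpoint: the messages recorded as sent/received.

    Each message is a (sender_id, receiver_id, sequence_number) tuple.
    """
    sent: set = field(default_factory=set)
    received: set = field(default_factory=set)


def orphan_messages(checkpoints):
    """Messages whose receipt is recorded but whose send is not."""
    return {
        msg
        for cp in checkpoints.values()
        for msg in cp.received
        if msg not in checkpoints[msg[0]].sent
    }


def in_transit_messages(checkpoints):
    """Messages whose send is recorded but whose receipt is not."""
    return {
        msg
        for cp in checkpoints.values()
        for msg in cp.sent
        if msg not in checkpoints[msg[1]].received
    }


def is_consistent(checkpoints):
    """A set of checkpoints is consistent when it contains no orphan messages."""
    return not orphan_messages(checkpoints)


# Three processes: P0 sent two messages, P2 recorded a receipt from P1
# that P1's checkpoint never recorded as sent.
checkpoints = {
    0: Checkpoint(sent={(0, 1, 1), (0, 2, 1)}),
    1: Checkpoint(received={(0, 1, 1)}),
    2: Checkpoint(received={(1, 2, 1)}),
}

orphans = orphan_messages(checkpoints)
in_transit = in_transit_messages(checkpoints)
consistent = is_consistent(checkpoints)
```

In this example `(1, 2, 1)` is an orphan, so the checkpoint set is inconsistent, while `(0, 2, 1)` is merely in transit and would have to be recovered, for instance by retransmission.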

Similar resources

An Enhanced MSS-based checkpointing Scheme for Mobile Computing Environment

Mobile computing systems are made up of different components, among which Mobile Support Stations (MSSs) play a key role. This paper proposes an efficient MSS-based non-blocking coordinated checkpointing scheme for mobile computing environments. In the suggested scheme, nearly all aspects of checkpointing and their related overheads are forwarded to the MSSs, and as a result the workload of Mobile ...

Real Time Snapshot Collection Algorithm for Mobile Distributed Systems with Minimum Number of Checkpoints

Checkpointing is an efficient way of implementing fault tolerance in distributed systems. Mobile computing raises many new issues, such as high mobility, lack of stable storage on mobile hosts (MHs), low bandwidth of wireless channels, limited battery life and disconnections that make the traditional checkpointing protocols unsuitable for such systems. Minimum process non-blocking coordinated c...

A Fast And Efficient Non-Blocking Coordinated Checkpointing Approach For Distributed Systems

In this paper, we present an efficient non-blocking coordinated checkpointing algorithm for distributed systems. The distinct advantages of the proposed algorithm are the following. It produces a consistent set of checkpoints without the overhead of taking temporary checkpoints; the algorithm also ensures that only a few processes are required to take checkpoints in any execution; ...

A Non-blocking Minimum-process Checkpointing Protocol for Deterministic Mobile Computing Systems

The term Distributed Systems is used to describe a system with the following characteristics: i) it consists of several computers that do not share memory or a clock, ii) the computers communicate with each other by exchanging messages over a communication network, iii) each computer has its own memory and runs its own operating system. In the mobile distributed system, some of the processes ar...

An Application-Transparent, Platform-Independent Approach to Rollback-Recovery for Mobile Agent Systems

This paper proposes a new approach to rollback-recovery for mobile-agent systems and describes its implementation in the MESSENGERS mobile agents system. The checkpointing method used makes it possible to implement space- and time-efficient, user-transparent rollback-recovery in heterogeneous distributed environments. Together with an efficient non-blocking system snapshot algorithm this checkpointing met...

Publication date: 2011